Search CORE

283 research outputs found

Evaluating The Fairness Of The Undergraduate Supports Survey: A DIF Analysis Of Gender And Year-In-School

Author: Douglas Kerrie
Gentry Adrian Nat
Holloway Eric
Li Tiantian
Martin Julie
Publication venue: Technological University Dublin
Publication date: 10/10/2023

Technologies for Reusing Text from the Web

Author: Potthast Martin (Dr. rer. nat.)
Publication date: 17/02/2012

Texts from the web can be reused individually or in large quantities. The former is called text reuse and the latter language reuse. We first present a comprehensive overview of the different ways in which text and language are reused today, and how exactly information retrieval technologies can be applied in this respect. The remainder of the thesis then deals with specific retrieval tasks. In general, our contributions consist of models and algorithms, their evaluation, and for that purpose, large-scale corpus construction. The thesis divides into two parts. The first part introduces technologies for text reuse detection, and our contributions are as follows: (1) A unified view of projection-based and embedding-based fingerprinting for near-duplicate detection and the first evaluation of fingerprint algorithms on Wikipedia revision histories as a new, large-scale corpus of near-duplicates. (2) A new retrieval model for the quantification of cross-language text similarity, which gets by without parallel corpora. We have evaluated the model in comparison to other models on many different pairs of languages. (3) An evaluation framework for text reuse and particularly plagiarism detectors, which consists of tailored detection performance measures and a large-scale corpus of automatically generated and manually written plagiarism cases. The latter have been obtained via crowdsourcing. This framework has been successfully applied to evaluate many different state-of-the-art plagiarism detection approaches within three international evaluation competitions. The second part introduces technologies that solve three retrieval tasks based on language reuse, and our contributions are as follows: (4) A new model for the comparison of textual and non-textual web items across media, which exploits web comments as a source of information about the topic of an item.
In this connection, we identify web comments as a largely neglected information source and introduce the rationale of comment retrieval. (5) Two new algorithms for query segmentation, which exploit web n-grams and Wikipedia as a means of discerning the user intent of a keyword query. Moreover, we crowdsource a new corpus for the evaluation of query segmentation which surpasses existing corpora by two orders of magnitude. (6) A new writing assistance tool called Netspeak, which is a search engine for commonly used language. Netspeak indexes the web in the form of web n-grams as a source of writing examples and implements a wildcard query processor on top of it.
Texts from the web can be reused individually or in large quantities. The former is called text reuse and the latter language reuse. We first give a detailed overview of the ways in which text and language are reused today and of how information retrieval technologies can be applied in this context. The rest of the thesis then deals with specific retrieval tasks. Our contributions consist of models and algorithms, their empirical evaluation, and the construction of large corpora for that purpose. The dissertation is divided into two parts. In the first part we present technologies for detecting text reuse and make the following contributions: (1) An overview of projection-based and embedding-based fingerprinting methods for detecting near-identical texts, and the first evaluation of a range of such methods on the revision histories of Wikipedia. (2) A new model for the cross-language, content-based comparison of texts. The model is based on a multilingual corpus consisting of pairs of topically related texts, such as Wikipedia. We compare the model with conventional models across several languages. (3) An evaluation framework for plagiarism detection algorithms. The framework consists of measures that quantify the detection quality of an algorithm and a large corpus of plagiarism cases. The plagiarism cases were generated automatically and also written manually with the help of crowdsourcing. In addition, we organized two workshops in which our evaluation framework was successfully used to evaluate current plagiarism detection algorithms. In the second part we present technologies based on language reuse for three different retrieval tasks and make the following contributions: (4) A new model for the cross-media, content-based comparison of objects from the web. The model is based on analysing the comments available for an object. In this connection, we identify web comments as an information source that research has so far neglected and present the foundations of comment retrieval. (5) Two new algorithms for segmenting web search queries. The algorithms use web n-grams and Wikipedia to determine the intent of the searcher in a query. In addition, we used crowdsourcing to create a new evaluation corpus that is two orders of magnitude larger than previous corpora. (6) A novel search engine called Netspeak, which enables searching for commonly used language. Netspeak indexes the web as a source of commonly used language in the form of n-grams and implements a wildcard search on top of it.
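
Contribution (6) describes a wildcard query processor over web n-grams. The following is a minimal sketch of that idea on a toy corpus, not Netspeak's actual implementation: the function names, the `?` wildcard syntax, and the three example sentences are all assumptions made for illustration.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return all contiguous n-token tuples in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def build_index(sentences, n):
    """Count every n-gram in a toy corpus (a stand-in for web n-gram counts)."""
    index = Counter()
    for sentence in sentences:
        index.update(ngrams(sentence.lower().split(), n))
    return index

def wildcard_query(index, pattern):
    """Return matching n-grams with their counts, most frequent first.

    A '?' in the pattern (hypothetical syntax) matches exactly one word.
    """
    query = pattern.lower().split()
    hits = [
        (gram, count)
        for gram, count in index.items()
        if len(gram) == len(query)
        and all(q == "?" or q == w for q, w in zip(query, gram))
    ]
    # Sort by descending frequency, then alphabetically for a stable order.
    return sorted(hits, key=lambda hit: (-hit[1], hit[0]))

# Hypothetical example corpus.
corpus = [
    "we look forward to seeing you",
    "we look forward to meeting you",
    "I look forward to seeing you again",
]
index = build_index(corpus, 4)
results = wildcard_query(index, "forward to ? you")
for gram, count in results:
    print(" ".join(gram), count)
```

A linear scan over the index is only workable for a toy corpus; a web-scale system would need an index structure that avoids touching every n-gram per query.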

Online-Publikationssystem der Bauhaus-Universität Weimar

Assessing the viability of estimating baleen whale abundance from tourist vessels

Author: Biuw Martin
Henderson Angus Fleetwood
Hindell Mark Andrew
Kelly Nat
Lea Mary-Anne
Lowther Andrew
Wotherspoon Simon
Publication date: 01/01/2023

Many populations of southern hemisphere baleen whales are recovering and are again becoming dominant consumers in the Southern Ocean. Key to understanding the present and future role of baleen whales in Southern Ocean ecosystems is determining their abundance on foraging grounds. Distance sampling is the standard method for estimating baleen whale abundance but has specific logistical requirements that are rarely met in the remote Southern Ocean. We explore the potential use of tourist vessel-based sampling as a cost-effective solution for conducting distance sampling surveys for baleen whales in the Southern Ocean. First, we used a dataset of tourist vessel locations from the southwest Atlantic sector of the Southern Ocean and published knowledge from Southern Ocean sighting surveys to determine the number of tourist vessel voyages required for robust abundance estimates. Second, we simulated the abundance and distributions of four baleen whale species for the study area and sampled them with both standardized line transect surveys and non-standardized tourist vessel-based surveys, then compared modeled abundance and distributions from each survey to the original simulation. For the southwest Atlantic, we show that 12-22 tourist vessel voyages are likely required to estimate abundance for humpback and fin whales, with relative estimates for blue, sei, Antarctic minke, and southern right whales. Second, we show tourist vessel-based surveys outperformed standardized line transect surveys at reproducing simulated baleen whale abundances and distribution. These analyses suggest tourist vessel-based surveys are a viable method for estimating baleen whale abundance in remote regions.
For the southwest Atlantic, the relatively cost-effective nature of tourist vessel-based surveys and the regularity of tourist vessel voyages could allow for annual and intra-annual estimates of abundance, a fundamental improvement on current methods, which may capture spatiotemporal trends in baleen whale movements on foraging grounds. Comparative modeling of sampling methods provided insights into the behavior of general additive model-based abundance modeling, contributing to the development of detailed guidelines of best practices for these approaches. Through successful engagement with tourist company partners, this method has the potential to characterize abundance across a variety of marine species and spaces globally, and deliver high-quality scientific outcomes relevant to management organizations.
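
The abstract relies on distance sampling along line transects. As a rough illustration of the textbook estimator, and not the model-based analysis used in the study, the sketch below assumes a half-normal detection function, certain detection on the trackline, and no truncation of distances. All numbers and function names are hypothetical.

```python
import math

def effective_strip_half_width(sigma):
    """Integral of the half-normal detection function exp(-x^2 / (2 sigma^2)) over x >= 0."""
    return sigma * math.sqrt(math.pi / 2)

def density_estimate(n_detections, transect_length, sigma):
    """Animals per unit area: n / (2 * mu * L), where mu is the effective strip half-width."""
    mu = effective_strip_half_width(sigma)
    return n_detections / (2 * mu * transect_length)

def abundance_estimate(n_detections, transect_length, sigma, area):
    """Scale the density estimate up to the whole study area."""
    return density_estimate(n_detections, transect_length, sigma) * area

# Hypothetical survey: 40 detections along 100 km of trackline,
# detection scale sigma = 0.8 km, study area of 5000 km^2.
estimate = abundance_estimate(40, 100.0, 0.8, 5000.0)
print(round(estimate))
```

In practice sigma is fitted to the observed perpendicular distances, and corrections for group size and for animals missed on the trackline are applied; both are omitted here.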

Brage IMR

Immersive design engineering

Author: Lee Chang Hee
Martin Nat
Sommer Bjorn
Torrisi Savina
Publication venue: 'Society for Imaging Science & Technology'
Publication date: 26/01/2020

Design Engineering is an innovative field that usually combines a number of disciplines, such as material science, mechanics, electronics, and/or biochemistry. New immersive technologies, such as Virtual Reality (VR) and Augmented Reality (AR), are currently in the process of being widely adopted in various engineering fields. It is a proven fact that the modeling of spatial structures is supported by immersive exploration. But the field of Design Engineering reaches beyond standard engineering tasks. With this review paper we want to achieve the following: define the term “Immersive Design Engineering”, discuss a number of recent immersive technologies in this context, and provide an inspiring overview of work that belongs to, or is related to, the field of Immersive Design Engineering. Finally, the paper concludes with definitions of research questions as well as a number of suggestions for future developments.

Royal College of Art Research Repository

The abuse, neglect and mistreatment of older people in care homes and hospitals in England: observations on the potential for secondary data analysis

Author: Heath Hazel
Hussein Shereen
Lievesley Nat
Manthorpe Jill
Stevens Martin
Publication venue: King's College London
Publication date: 10/01/2011

This study has investigated what sources of data exist on the subject of elder abuse in care home and hospital settings in England. It was commissioned by the Department of Health and Comic Relief. We used a broad definition of elder abuse to cover mistreatment, neglect and abuse. Some of these subjects are criminal offences; others are contrary to professional codes or service standards, or are breaches of human rights. Defining elder abuse is not easy, as the recent study of definitions produced for this programme of research (PANICOA) confirms (Dixon et al 2009). The main part of this study involved ‘desk research’ – an exploration of what data are collected, why, by whom and about what. In addition, a set of interviews was undertaken with people who collect and analyse information on this subject and those who make use of such information to uphold older people’s rights. We found that data are scarce and limited, definitions and collection are unsystematised centrally and locally, and currently demand collation from various and disparate sources.

Kent Academic Repository

King's Research Portal

A systematic scoping review evaluating sugar-sweetened beverage taxation from a systems perspective

Author: Abdool Karim Safura
Adams Jean
Alvarado Miriam
Carters-White Lauren
Egan Nat
Murphy Madhuvanti M.
Penney Tarra
Rogers Nina Trivedy
White Martin
Publication date: 19/10/2023

Edinburgh Research Explorer

Understanding the Changing Cultural Value of the BBC World Service and the British Council

Author: Bell Simon
Fisher Ali
Foster Tot
Gillespie Marie
Lvov Ilia
Macfarlane Jess
Martin Nat
Smith Andrew W. M.
Voss Alex
Webb Alban
Wilding Colin
Publication venue: Arts and Humanities Research Council
Publication date: 01/07/2014

This project investigated the changing cultural value of the BBC World Service (WS) and the British Council (BC) and how their cultural value can be assessed and measured. For eight decades, these organisations have been the face and voice of Britain overseas. Our research found that their attraction and influence abroad remain strong, but are on the wane, reflecting the UK’s declining economic and political significance on the world stage. Among the key findings of our historical and contemporary research: Cultural value is the catalyst of all aspects of value at WS and BC, founded on their capacity to act as transcultural intermediaries, fostering international understanding, and setting benchmarks in global standards for journalism and cultural relations work. Cultural value is relational, never independent of political and economic value. It is perspectival: audiences trust the quality and credibility of outputs; high professional standards and prestige benefit staff; funders appreciate the diplomatic and soft power assets. Cultural value accrues slowly over time but can be quickly lost. Social media afford new ways of connecting, informing and engaging citizens at home and abroad. Our case studies analysing the uses of Twitter and Facebook by BC and WS around global media events underscore the so far limited role of social media in democratising participation and promoting intercultural dialogue. We developed an innovative, theoretically grounded and empirically informed Cultural Value Model (CVM), a device for conceptualising, analysing and assessing value in a multidimensional, composite, visual way. The CVM is designed for planning, monitoring and evaluating projects and organisations over time, alongside existing performance indicators and impact measures. It is currently being tested and developed on further projects at WS and BC as well as at the Swedish Institute.

Open Research Online (The Open University)

Einfluss der Lagerungsbedingungen auf das Gasbildungsverhalten von flüssigen Gärrestanteilen zur landwirtschaftlichen Nutzung

Author: Deneke Martin (Prof. Dr. rer. nat.)
Gehrke Tobias (M.Sc.)
Möller Jan (Dipl.-Ing.)
Publication date: 01/01/2014

With the introduction of the Landfill Ordinance (Deponieverordnung) in 2005, the landfilling of non-inert substances and biologically active species came to an end. As the ordinance was enforced, new treatment options for these material streams came into focus. These treatment methods include anaerobic biological treatment, which is used mainly for material streams with a high organic content and high water content (Kaltschmitt et al. 2009). During anaerobic treatment, organic carbon compounds are converted mainly into CO2 and CH4. Anaerobic treatment thus reduces the organic fraction and the volume of the digestion substrate. In return, the water content of the digestion substrate rises as cell water is released. Phosphorus and nitrogen compounds remain in the digestion substrate, which makes it a suitable fertilizer for agriculture (Möller et al. 2009). The material is subject to the Fertilizer Ordinance (Düngemittelverordnung), which regulates both its application and the requirements on its composition. To use the resulting digestate in agriculture, a means of medium-term storage must therefore be provided (Raussen et al. 2008). So far there have been only a few studies of its behaviour during this storage period. The main reason is the high diversity of the press water, whose properties are determined by the operating parameters of the biogas plant and therefore differ markedly from plant to plant. The work presented here is part of a research project investigating the storage behaviour of press water from the Leppe digestion and composting plant. The plant has two storage tanks with a capacity of 3,500 m³ each. Both anaerobic and aerobic operation were provided for as early as the planning stage. In addition to the three agitators that ensure the press water is mixed, two injection aerators are installed.
The aim of the investigations was to create a data basis on which the operating mode of the press water storage tanks can be decided.

ePublications

Fine-tuning language models to find agreement among humans with diverse preferences

Author: Aslanides John
Bakker Michiel A.
Balaguer Jan
Botvinick Matthew M.
Campbell-Gillingham Lucy
Chadwick Martin J.
Glaese Amelia
McAleese Nat
Sheahan Hannah R.
Summerfield Christopher
Tessler Michael Henry
Publication date: 27/11/2022

Recent work on large language models (LLMs) has used fine-tuning to align outputs with the preferences of a prototypical user. This work assumes that human preferences are static and homogeneous across individuals, so that aligning to a single "generic" user will confer more general alignment. Here, we embrace the heterogeneity of human preferences to consider a different challenge: how might a machine help people with diverse views find agreement? We fine-tune a 70 billion parameter LLM to generate statements that maximize the expected approval for a group of people with potentially diverse opinions. Human participants provide written opinions on thousands of questions touching on moral and political issues (e.g., "should we raise taxes on the rich?"), and rate the LLM's generated candidate consensus statements for agreement and quality. A reward model is then trained to predict individual preferences, enabling it to quantify and rank consensus statements in terms of their appeal to the overall group, defined according to different aggregation (social welfare) functions. The model produces consensus statements that are preferred by human users over those from prompted LLMs (>70%) and significantly outperforms a tight fine-tuned baseline that lacks the final ranking step. Further, our best model's consensus statements are preferred over the best human-generated opinions (>65%). We find that when we silently constructed consensus statements from only a subset of group members, those who were excluded were more likely to dissent, revealing the sensitivity of the consensus to individual contributions. These results highlight the potential to use LLMs to help groups of humans align their values with one another.
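
The ranking step described above, where per-person predicted preferences are aggregated with a social welfare function, can be sketched as follows. This is an illustration only: the abstract does not say which welfare functions were used, so the utilitarian (mean) and Rawlsian (minimum) choices here are assumptions, and the scores are made-up stand-ins for reward-model outputs.

```python
def utilitarian(scores):
    """Mean predicted approval across group members."""
    return sum(scores) / len(scores)

def rawlsian(scores):
    """Predicted approval of the least satisfied group member."""
    return min(scores)

def rank_candidates(predicted, welfare):
    """Sort candidate statements by aggregated predicted approval, best first."""
    return sorted(predicted, key=lambda name: welfare(predicted[name]), reverse=True)

# Hypothetical reward-model outputs: one predicted approval score per group member.
predicted = {
    "statement_a": [0.9, 0.9, 0.2],
    "statement_b": [0.6, 0.7, 0.6],
}
by_mean = rank_candidates(predicted, utilitarian)
by_min = rank_candidates(predicted, rawlsian)
print(by_mean, by_min)
```

The two aggregations disagree on this toy input: the mean favours the statement two members like strongly, while the minimum favours the one nobody strongly rejects.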

arXiv.org e-Print Archive

Recital: Students' Recital

Author: Clark Ida
Hill Oakley
Kemmerer Martin
Koch Dorothea
Law Mina
Lewis Lew
Nissley Martha
Olmstead Rosalie
Paltrowitz Marion
Roman Joseph
Rosenthal Nat
Watkins Doris
Winkelman Marion
Zimmerman Blanche
Publication venue: Digital Commons IC
Publication date: 11/03/1930

Ithaca College